AITopics | dirichlet process mixture

Collaborating Authors

dirichlet process mixture

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Dirichlet process mixtures of block $g$ priors for model selection and prediction in linear models

Porwal, Anupreet, Rodriguez, Abel

arXiv.org Artificial IntelligenceNov-1-2024

This paper introduces Dirichlet process mixtures of block $g$ priors for model selection and prediction in linear models. These priors are extensions of traditional mixtures of $g$ priors that allow for differential shrinkage for various (data-selected) blocks of parameters while fully accounting for the predictors' correlation structure, providing a bridge between the literatures on model selection and continuous shrinkage priors. We show that Dirichlet process mixtures of block $g$ priors are consistent in various senses and, in particular, that they avoid the conditional Lindley ``paradox'' highlighted by Som et al.(2016). Further, we develop a Markov chain Monte Carlo algorithm for posterior inference that requires only minimal ad-hoc tuning. Finally, we investigate the empirical performance of the prior in various real and simulated datasets. In the presence of a small number of very large effects, Dirichlet process mixtures of block $g$ priors lead to higher power for detecting smaller but significant effects without only a minimal increase in the number of false discoveries.

bayes factor, coefficient, procedure, (14 more...)

arXiv.org Artificial Intelligence

2411.00471

Country:

North America > United States > Ohio (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.93)

Add feedback

f7e6c85504ce6e82442c770f7c8606f0-Reviews.html

Neural Information Processing SystemsMar-14-2024, 00:11:12 GMT

The title of this paper is much like the paper itself: to-the-point, descriptive, and readable. "A simple example of Dirichlet process mixture inconsistency for the number of components" delivers on its promise by providing two easy-to-understand demonstrations of the severity of the problem of using Dirichlet process mixtures to estimate the number of components in a mixture model. The authors start by demonstrating that making such a component-cardinality estimate is widespread in the literature (and therefore a problem deserving of interest), briefly describe the Dirichlet process mixture (DPM) model (with particular emphasis on the popular normal likelihood case), and then demonstrate with a simple single-component mixture example how poorly estimation of component cardinality can go (their convincing answer: very poorly). Not only was the paper enjoyable to read but, refreshingly, didn't try to fit 20 pages of material into an 8 page limit. One potential criticism of this paper is that this result should be well-known in some sense in the community.

dpm, future work, proposition 3, (12 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.31)

Add feedback

Entropy regularization in probabilistic clustering

Franzolini, Beatrice, Rebaudo, Giovanni

arXiv.org Machine LearningJul-19-2023

Bayesian nonparametric mixture models are widely used to cluster observations. However, one major drawback of the approach is that the estimated partition often presents unbalanced clusters' frequencies with only a few dominating clusters and a large number of sparsely-populated ones. This feature translates into results that are often uninterpretable unless we accept to ignore a relevant number of observations and clusters. Interpreting the posterior distribution as penalized likelihood, we show how the unbalance can be explained as a direct consequence of the cost functions involved in estimating the partition. In light of our findings, we propose a novel Bayesian estimator of the clustering configuration. The proposed estimator is equivalent to a post-processing procedure that reduces the number of sparsely-populated clusters and enhances interpretability. The procedure takes the form of entropy-regularization of the Bayesian estimate. While being computationally convenient with respect to alternative strategies, it is also theoretically justified as a correction to the Bayesian loss function used for point estimation and, as such, can be applied to any posterior distribution of clusters, regardless of the specific model used.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

2307.10065

Country:

Europe > Italy > Piedmont > Turin Province > Turin (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

How do Dirichlet processes work(Machine Learning +Statistics)

#artificialintelligenceAug-28-2022, 10:00:05 GMT

Abstract: Understanding the spatio-temporal patterns of the coronavirus disease 2019 (COVID-19) is essential to construct public health interventions. Spatially referenced data can provide richer opportunities to understand the mechanism of the disease spread compared to the more often encountered aggregated count data. We propose a spatio-temporal Dirichlet process mixture model to analyze confirmed cases of COVID-19 in an urban environment. Our method can detect unobserved cluster centers of the epidemics, and estimate the space-time range of the clusters that are useful to construct a warning system. Furthermore, our model can measure the impact of different types of landmarks in the city, which provides an intuitive explanation of disease spreading sources from different time points.

algorithm, dirichlet process mixture, dirichlet process work, (11 more...)

#artificialintelligence

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.56)

Add feedback

Bayesian nonparametric mixture inconsistency for the number of components: How worried should we be in practice?

Chaumeny, Yannis, Moris, Johan van der Molen, Davison, Anthony C., Kirk, Paul D. W.

arXiv.org Machine LearningJul-29-2022

We consider the Bayesian mixture of finite mixtures (MFMs) and Dirichlet process mixture (DPM) models for clustering. Recent asymptotic theory has established that DPMs overestimate the number of clusters for large samples and that estimators from both classes of models are inconsistent for the number of clusters under misspecification, but the implications for finite sample analyses are unclear. The final reported estimate after fitting these models is often a single representative clustering obtained using an MCMC summarisation technique, but it is unknown how well such a summary estimates the number of clusters. Here we investigate these practical considerations through simulations and an application to gene expression data, and find that (i) DPMs overestimate the number of clusters even in finite samples, but only to a limited degree that may be correctable using appropriate summaries, and (ii) misspecification can lead to considerable overestimation of the number of clusters in both DPMs and MFMs, but results are nevertheless often still interpretable. We provide recommendations on MCMC summarisation and suggest that although the more appealing asymptotic properties of MFMs provide strong motivation to prefer them, results obtained using MFMs and DPMs are often very similar in practice.

artificial intelligence, machine learning, mixture model, (15 more...)

arXiv.org Machine Learning

2207.14717

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
Europe > Switzerland > Vaud > Lausanne (0.04)
North America > United States > Pennsylvania (0.04)
(5 more...)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Add feedback

Clustering consistency with Dirichlet process mixtures

Ascolani, Filippo, Lijoi, Antonio, Rebaudo, Giovanni, Zanella, Giacomo

arXiv.org Machine LearningMay-25-2022

Dirichlet process mixtures are flexible non-parametric models, particularly suited to density estimation and probabilistic clustering. In this work we study the posterior distribution induced by Dirichlet process mixtures as the sample size increases, and more specifically focus on consistency for the unknown number of clusters when the observed data are generated from a finite mixture. Crucially, we consider the situation where a prior is placed on the concentration parameter of the underlying Dirichlet process. Previous findings in the literature suggest that Dirichlet process mixtures are typically not consistent for the number of clusters if the concentration parameter is held fixed and data come from a finite mixture. Here we show that consistency for the number of clusters can be achieved if the concentration parameter is adapted in a fully Bayesian way, as commonly done in practice. Our results are derived for data coming from a class of finite mixtures, with mild assumptions on the prior for the concentration parameter and for a variety of choices of likelihood kernels for the mixture.

artificial intelligence, dirichlet process mixture, machine learning, (1 more...)

arXiv.org Machine Learning

doi: 10.1093/biomet/asac051

2205.12924

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Add feedback

Clustering of non-Gaussian data by variational Bayes for normal inverse Gaussian mixture models

Takekawa, Takashi

arXiv.org Machine LearningSep-13-2020

Finite mixture models, typically Gaussian mixtures, are well known and widely used as model-based clustering. In practical situations, there are many non-Gaussian data that are heavy-tailed and/or asymmetric. Normal inverse Gaussian (NIG) distributions are normal-variance mean which mixing densities are inverse Gaussian distributions and can be used for both haavy-tail and asymmetry. For NIG mixture models, both expectation-maximization method and variational Bayesian (VB) algorithms have been proposed. However, the existing VB algorithm for NIG mixture have a disadvantage that the shape of the mixing density is limited. In this paper, we propose another VB algorithm for NIG mixture that improves on the shortcomings. We also propose an extension of Dirichlet process mixture models to overcome the difficulty in determining the number of clusters in finite mixture models. We evaluated the performance with artificial data and found that it outperformed Gaussian mixtures and existing implementations for NIG mixtures, especially for highly non-normative data.

artificial intelligence, machine learning, mixture model, (16 more...)

arXiv.org Machine Learning

2009.06002

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Nearest Neighbor Dirichlet Process

Chattopadhyay, Shounak, Chakraborty, Antik, Dunson, David B.

arXiv.org Machine LearningMar-17-2020

There is a rich literature on Bayesian nonparametric methods for unknown densities. The most popular approach relies on Dirichlet process mixture models. These models characterize the unknown density as a kernel convolution with an unknown almost surely discrete mixing measure, which is given a Dirichlet process prior. Such models are very flexible and have good performance in many settings, but posterior computation relies on Markov chain Monte Carlo algorithms that can be complex and inefficient. As a simple and general alternative, we propose a class of nearest neighbor-Dirichlet processes. The approach starts by grouping the data into neighborhoods based on standard algorithms. Within each neighborhood, the density is characterized via a Bayesian parametric model, such as a Gaussian with unknown parameters. Assigning a Dirichlet prior to the weights on these local kernels, we obtain a simple pseudo-posterior for the weights and kernel parameters. A simple and embarrassingly parallel Monte Carlo algorithm is proposed to sample from the resulting pseudo-posterior for the unknown density. Desirable asymptotic properties are shown, and the methods are evaluated in simulation studies and applied to a motivating dataset in the context of classification.

dirichlet process mixture, equation, process mixture, (12 more...)

arXiv.org Machine Learning

2003.07953

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Austria > Vienna (0.14)
Asia > Middle East > Jordan (0.04)
(3 more...)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Sequence Alignment with Dirichlet Process Mixtures

Kazlauskaite, Ieva, Ustyuzhaninov, Ivan, Ek, Carl Henrik, Campbell, Neill D. F.

arXiv.org Machine LearningNov-26-2018

We present a probabilistic model for unsupervised alignment of high-dimensional time-warped sequences based on the Dirichlet Process Mixture Model (DPMM). We follow the approach introduced in (Kazlauskaite, 2018) of simultaneously representing each data sequence as a composition of a true underlying function and a time-warping, both of which are modelled using Gaussian processes (GPs) (Rasmussen, 2005), and aligning the underlying functions using an unsupervised alignment method. In (Kazlauskaite, 2018) the alignment is performed using the GP latent variable model (GP-LVM) (Lawrence, 2005) as a model of sequences, while our main contribution is extending this approach to using DPMM, which allows us to align the sequences temporally and cluster them at the same time. We show that the DPMM achieves competitive results in comparison to the GP-LVM on synthetic and real-world data sets, and discuss the different properties of the estimated underlying functions and the time-warps favoured by these models.

artificial intelligence, machine learning, sequence, (15 more...)

arXiv.org Machine Learning

1811.10689

Country:

Europe > United Kingdom > England (0.15)
Europe > Germany (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Add feedback

Mixture modeling on related samples by $\psi$-stick breaking and kernel perturbation

Soriano, Jacopo, Ma, Li

arXiv.org Machine LearningApr-16-2017

There has been great interest recently in applying nonparametric kernel mixtures in a hierarchical manner to model multiple related data samples jointly. In such settings several data features are commonly present: (i) the related samples often share some, if not all, of the mixture components but with differing weights, (ii) only some, not all, of the mixture components vary across the samples, and (iii) often the shared mixture components across samples are not aligned perfectly in terms of their location and spread, but rather display small misalignments either due to systematic cross-sample difference or more often due to uncontrolled, extraneous causes. Properly incorporating these features in mixture modeling will enhance the efficiency of inference, whereas ignoring them not only reduces efficiency but can jeopardize the validity of the inference due to issues such as confounding. We introduce two techniques for incorporating these features in modeling related data samples using kernel mixtures. The first technique, called $\psi$-stick breaking, is a joint generative process for the mixing weights through the breaking of both a stick shared by all the samples for the components that do not vary in size across samples and an idiosyncratic stick for each sample for those components that do vary in size. The second technique is to imbue random perturbation into the kernels, thereby accounting for cross-sample misalignment. These techniques can be used either separately or together in both parametric and nonparametric kernel mixtures. We derive efficient Bayesian inference recipes based on MCMC sampling for models featuring these techniques, and illustrate their work through both simulated data and a real flow cytometry data set in prediction/estimation, cross-sample calibration, and testing multi-sample differences.

artificial intelligence, kernel parameter, machine learning, (17 more...)

arXiv.org Machine Learning

1704.04839

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.91)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)

Add feedback